perm filename CMP.SJU[P,JRA] blob
sn#155752 filedate 1975-04-22 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00003 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 STRINGS
C00007 00003 General notes.
C00009 ENDMK
C⊗;
STRINGS
Many high-level languages include operations on objects known as
strings. We define a string to be a linear sequence (of arbitrary
length) consisting of uppercase letters or digits. Thus AB or A123
or 01A23 are strings. We will blur the distinction between single
characters and the string consisting of the single character. Thus A
is a string. To give completeness, we include the empty string, ε.
We wish to implement strings as a structure in a language and wish to
include a rich complement of operations on such strings.
First we wish to be able to assign strings to string variables. Thus
x ← ABC is to mean that the variable, x, has been assigned a string
ABC.
Other operations are:
concat[x;y]: x and y are strings; this operation has as value a
representation of the concatenation of x with y. Thus
concat[ABC;DE] gives ABCDE. concat[ε;x] = concat[x;ε] = x.
first[x]: x is a non-empty string; first[x] gives the first element
of x. Thus first[ABC] gives A.
rest[x]: x is a non-empty string; rest[x] gives all but the first
element of x. Thus rest[ABC] = BC; rest[A]= ε, the empty string;
and rest[ε] is undefined.
I. Implementation. We will have an area of memory (String Space)
which is best visualized as a sequence of cells. Each cell is
capable of containing one character.
String variables will be accessed through a symbol table. For a
non-empty string variable,a symbol table entry will consist of a
pointer into string space and a character-count. The pointer will
point to the first character of the designated string and the
character count will determine the number of characters which are to
be included in the string. Thus:
x: ( 3 ↓ ) means x has value DEF.
→→→→→→→→→↓
↓
... A B C D E F ..... string space
Describe implementations of the operations concat, first, and rest.
In describing concat, be sure to take into account the problems of
storage management for String Space. Things which might flash before
your eyes might be: garbage collection; reference counters; sharing
of values; copying of values.
II. Other operations: How would you implement a function to give the
length of a string? What is the test for the empty string? What
about equality of strings? (hint: are strings stored uniquely? if
so, why; if not, why not)
subst[x;y;z] means "substitute the string x for the first occurrence
(in left to right order)of the string y in the string z". Thus
subst[AB; 01; AC0101] gives ACAB01. Analyze the problems of
implementing subst. For example, does the current representation help
or hurt. Do you modify the original value of z, or what?
General notes.
I would suggest that you read each problem through carefully before
you begin any writing. The problems are not meant to be tricky, but
care is required.
POSTFIX EVALUATION
We wish to evaluate expressions given in a postfix parenthesis-free
notation. Assume for this part of the problem that the expressions
contain only the binary operators for plus and times and contain only
numeric constants. You may further assume that the expressions are
well formed.
Thus anomalies like "23+*" or "234+" will not occur.
I. Describe a representation for such expressions and give a precise
algorithm for their evaluation.
II. Assume that we wish to extend this algorithm to handle
i. ill-formed input (therefore to produce error messages)
ii. operators other than plus and times,
iii. operators other than binary ones.
Describe appropriate modifications to your algorithm and/or data
representation to handle these three new cases.